Multi-level Data Layout Optimization for Heterogeneous Access Patterns

Author

Zhenhuan Gong (under the direction of Dr. Nagiza F. Samatova)

Abstract

GONG, ZHENHUAN. Multi-level Data Layout Optimization for Heterogeneous Access Patterns. (Under the direction of Dr. Nagiza F. Samatova.)

Recent years have seen an enormous increase in the computational power of leadership computing facilities. As a result, scientific applications running on supercomputers now produce huge amounts of data, from terascale to petascale. I/O subsystems, however, have not developed at a comparable pace, making data I/O and storage the major bottleneck in modern computing architectures. The problem is exacerbated by the need to run data-intensive analytic jobs, such as queries with multiple constraints, on datasets held in external storage.

Scientific applications produce multi-dimensional, multi-variate, double-precision datasets, which are usually stored on large-scale parallel file systems and are not well represented by traditional relational data models. Queries on scientific datasets involve multiple constraints and therefore produce heterogeneous I/O access patterns. How extreme-scale datasets are linearized and organized on parallel file systems is crucial to read performance for queries: an optimized layout results in more sequential reads of contiguous data blocks, which are much faster than seek-and-read operations on small non-contiguous blocks.

Existing data layout optimization techniques, while successful in improving read performance for certain application-specific access patterns, have failed to address more general and heterogeneous access patterns. They also often do not scale to the substantial growth in data size expected as datasets reach exascale in the near future. Moreover, existing technology usually performs layout optimization as a post-processing step on datasets already on storage systems. Post-processing approaches read the entire dataset from storage, optimize its layout, and write the processed data back, which is extremely inefficient given the I/O bottleneck in modern computer architectures and the huge size of the datasets. What is lacking is a general framework that performs data layout optimization for scientific datasets at simulation run time or I/O time, before the datasets are written to storage, so as to reduce I/O and storage overhead and speed up the entire process.

To address these problems, a multi-level data storage layout scheme is presented that optimizes for the heterogeneous access patterns induced by different types of queries in scientific data analysis. First, a hybrid layout scheme is presented that is optimized for two common access patterns: value-constrained accesses and space-constrained accesses. The scheme improves data locality and substantially reduces latency-bound I/O operations such as seeks. It is then generalized by MLOC, a parallel Multilevel Layout Optimization framework for Compressed scientific spatio-temporal data at extreme scale. MLOC includes multiple fine-grained data layout optimization kernels that form a generic core from which a broader constellation of such kernels can be organically consolidated in a hierarchical multilevel architecture, enabling effective data exploration with various combinations of access patterns. Specifically, the kernels are optimized for access patterns induced by (a) query-driven multi-variate, spatio-temporal constraints, (b) precision-driven data analytics, (c) compression-driven data reduction, (d) multi-resolution data sampling, and (e) multi-file data partitioning and organization on parallel file systems. When tested on query-driven exploration of compressed data, MLOC demonstrates an order of magnitude faster query response time compared to state-of-the-art scientific database management technology.

Based on MLOC, a parallel run-time data layout optimization framework, PARLO, is presented to perform data layout optimization on scientific datasets at simulation run time. By processing the datasets while they are still in memory, before they are written to disk, PARLO removes the additional I/O and storage overhead and significantly reduces overall processing time compared with traditional post-processing approaches. PARLO is integrated with ADIOS, a popular parallel I/O middleware. With the support of ADIOS's I/O libraries and the portable BP file format, PARLO achieves high-performance, parallel, and user-transparent data layout optimization at run time.

© Copyright 2013 by Zhenhuan Gong
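The abstract's claim that linearization determines how many contiguous reads a query needs can be illustrated with a toy sketch. This is not the layout scheme from the thesis: it assumes a plain row-major 2-D array, and the helper names `linear_offsets` and `count_runs` are hypothetical, introduced only for this illustration. It counts how many contiguous runs (roughly, seeks) two space-constrained queries of equal size would require.

```python
def linear_offsets(shape, region):
    """Row-major offsets of every cell in a 2-D rectangular region.

    shape  -- (rows, cols) of the full array
    region -- ((r0, r1), (c0, c1)), half-open index ranges
    """
    _, cols = shape
    (r0, r1), (c0, c1) = region
    return [r * cols + c for r in range(r0, r1) for c in range(c0, c1)]


def count_runs(offsets):
    """Number of contiguous runs needed to read the given offsets."""
    ordered = sorted(offsets)
    if not ordered:
        return 0
    # A new run starts wherever two neighbouring offsets are not adjacent.
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b != a + 1)


shape = (8, 8)
row_band = ((2, 4), (0, 8))  # two full rows: 16 cells
col_band = ((0, 8), (2, 4))  # two full columns: 16 cells

runs_row_band = count_runs(linear_offsets(shape, row_band))
runs_col_band = count_runs(linear_offsets(shape, col_band))
print(runs_row_band, runs_col_band)
```

Both queries touch the same number of cells, but under row-major order the row band is one contiguous run while the column band is split into one short run per row. A layout tuned for only one of these access patterns penalizes the other, which is the heterogeneity problem the abstract describes.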

Related resources

Parallel Data Layout Optimization of Scientific Data through Access-driven Replication

Efficient I/O on large-scale spatio-temporal scientific data requires scrutiny of both the logical layout of the data (e.g., row-major vs. column-major) and the physical layout (e.g., distribution on parallel filesystems). For increasingly complex datasets, hand optimization is a difficult matter prone to error and not scalable to the increasing heterogeneity of analysis workloads. Given these ...

RADAR: Runtime Asymmetric Data-Access Driven Scientific Data Replication

Efficient I/O on large-scale spatiotemporal scientific data requires scrutiny of both the logical layout of the data (e.g., row-major vs. column-major) and the physical layout (e.g., distribution on parallel filesystems). For increasingly complex datasets, hand optimization is a difficult matter prone to error and not scalable to the increasing heterogeneity of analysis workloads. Given these f...

Maximum Maintainability of Complex Systems via Modulation Based on DSM and Module Layout. Case Study: Laser Range Finder

The present paper aims to investigate the effects of modularity and the layout of subsystems and parts of a complex system on its maintainability. For this purpose, four objective functions have been considered simultaneously: I) maximizing the level of accordance between system design and optimum modularity design, II) maximizing the level of accessibility and the maintenance space required, III...

Data layout optimization for multi-valued containers in OpenCL

Scientific data is mostly multi-valued, e.g., coordinates, velocities, moments or feature components, and it comes in large quantities. The data layout of such containers has an enormous impact on the achieved performance; however, layout optimization is very time-consuming and error-prone because container access syntax in standard programming languages is not sufficiently abstract. This means...

A Compile-Time Data Locality Optimization Framework for NUCA Chip Multiprocessors

With increasing numbers of cores, future CMPs (Chip MultiProcessors) are likely to have a tiled architecture with a portion of shared L2 cache on each tile and a bank-interleaved distribution of the address space. For data-parallel programming models, there is a mismatch between such a non-uniform cache organization and the canonical row-major or column-major layouts of multi-dimensional arrays...

Publication date: 2013
